
support openai compatible model reasoning content in streaming response#13

Merged
sjy3 merged 2 commits into volcengine:main from UnderTreeTech:main
Jan 4, 2026

Conversation

@UnderTreeTech
Contributor

@UnderTreeTech UnderTreeTech commented Jan 3, 2026

Support reasoning content in the streaming response, so that users can register a custom AfterModelCallback to filter out thought content during streaming LLM calls.
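As a rough sketch of the mapping this PR enables (the `StreamDelta` and `Part` types below are illustrative stand-ins, not the actual SDK definitions): an OpenAI-compatible streaming delta that carries `reasoning_content` is surfaced as a part with `Thought` set to true, so downstream callbacks can distinguish thoughts from the final answer.

```go
package main

import "fmt"

// StreamDelta mimics the delta object of an OpenAI-compatible streaming
// chunk; ReasoningContent carries the model's thinking output.
// (Illustrative type, not the actual SDK definition.)
type StreamDelta struct {
	Content          string `json:"content"`
	ReasoningContent string `json:"reasoning_content"`
}

// Part mirrors the Thought flag on genai.Part for this sketch.
type Part struct {
	Text    string
	Thought bool
}

// deltaToParts converts a streaming delta into response parts, marking
// reasoning content with Thought=true so callbacks can filter it later.
func deltaToParts(d StreamDelta) []Part {
	var parts []Part
	if d.ReasoningContent != "" {
		parts = append(parts, Part{Text: d.ReasoningContent, Thought: true})
	}
	if d.Content != "" {
		parts = append(parts, Part{Text: d.Content, Thought: false})
	}
	return parts
}

func main() {
	parts := deltaToParts(StreamDelta{ReasoningContent: "thinking...", Content: "Hello"})
	for _, p := range parts {
		fmt.Printf("thought=%v text=%q\n", p.Thought, p.Text)
	}
}
```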

@UnderTreeTech UnderTreeTech changed the title from "support reasoning content in streaming response" to "support openai compatible model reasoning content in streaming response" Jan 4, 2026
@sjy3 sjy3 merged commit d742344 into volcengine:main Jan 4, 2026
2 checks passed
@sjy3
Collaborator

sjy3 commented Jan 4, 2026


When thinking mode is activated, the web page generates multiple messages. If you are willing to fix this issue, please submit a follow-up pull request (PR).

@UnderTreeTech
Contributor Author

@sjy3 We can add a standard AfterModelCallback implementation that filters out thought content, which allows flexible control over thought visibility:

func ThoughtFilterCallback(ctx agent.CallbackContext, llmResponse *model.LLMResponse, llmResponseError error) (*model.LLMResponse, error) {
    // Returning (nil, nil) leaves the original response unmodified.
    if llmResponseError != nil || llmResponse == nil || llmResponse.Content == nil {
        return nil, nil
    }

    var filteredParts []*genai.Part
    hasThought := false

    // Keep only the parts that are not flagged as thoughts.
    for _, part := range llmResponse.Content.Parts {
        if !part.Thought {
            filteredParts = append(filteredParts, part)
        } else {
            hasThought = true
        }
    }

    if hasThought {
        // Shallow-copy the response so the original is not mutated.
        newResponse := *llmResponse
        newResponse.Content = &genai.Content{
            Role:  llmResponse.Content.Role,
            Parts: filteredParts,
        }
        return &newResponse, nil
    }

    return nil, nil
}

Usage:

agent, err := llmagent.New(llmagent.Config{
    Name: "MyAgent",
    Model: model,
    AfterModelCallbacks: []llmagent.AfterModelCallback{
        ThoughtFilterCallback,  // Filter thought content
    },
})

Benefits

  • Flexibility: Can be enabled or disabled per agent by including or excluding the callback
  • Consistency: Provides a standard approach across different OpenAI-compatible providers
  • No Breaking Changes: Fully opt-in via the existing callback mechanism
  • Frontend/Backend Choice: The decision to filter can be made at either layer
    1. Backend filtering: Register the callback to strip thought content before it is sent to the frontend
    2. Frontend filtering: Don't register the callback; let the frontend decide whether to display thought content based on the part.Thought flag
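The frontend-filtering option could look like the following sketch (the `Part` type and `renderVisible` helper are hypothetical; in practice the frontend would inspect the `Thought` flag on each `genai.Part`):

```go
package main

import "fmt"

// Part stands in for genai.Part in this sketch.
type Part struct {
	Text    string
	Thought bool
}

// renderVisible returns only the text the frontend chooses to show,
// skipping parts flagged as thoughts unless showThoughts is set.
func renderVisible(parts []Part, showThoughts bool) []string {
	var out []string
	for _, p := range parts {
		if p.Thought && !showThoughts {
			continue
		}
		out = append(out, p.Text)
	}
	return out
}

func main() {
	parts := []Part{
		{Text: "reasoning...", Thought: true},
		{Text: "answer", Thought: false},
	}
	fmt.Println(renderVisible(parts, false)) // thoughts hidden
	fmt.Println(renderVisible(parts, true))  // thoughts shown
}
```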

